Abstract: In distributed storage systems (DSSs), regenerating
codes are used to optimize the bandwidth consumed in repairing
a failed node. To optimize other DSS parameters such as
computation and disk I/O, Distributed Replication-based Simple
Storage (DRESS) codes, consisting of an inner Fractional
Repetition (FR) code and an outer MDS code, are commonly used.
Constructing FR codes is therefore an important research
problem, and several constructions using graphs and designs
have been proposed. In this paper, we present an algorithm for
constructing the node-packet distribution matrix of FR codes
and thus enumerate some FR codes up to a given number of nodes
$n$. We also present algorithms for constructing regular graphs
which give rise to FR codes.

I. Introduction 

The emerging era of cloud computing poses new challenges
for researchers to provide reliable and secure data storage.
Practical systems for distributed storage include the
Hadoop-based system [1] used in Facebook and Windows Azure
storage. In these distributed storage systems (DSSs), data is
stored on $n$ unreliable nodes. Reliability is provided either
by replicating the data or by using erasure MDS (Maximum
Distance Separable) codes. Both schemes have drawbacks in terms
of bandwidth, complexity, or disk I/O. To overcome these
limitations, regenerating codes were introduced by Dimakis
et al. [3] and subsequently studied by many researchers. A node
failure in such systems can be handled by regenerating the data
stored on that node using its peers. This regeneration can be
functional or exact. Functional repair restores the data such
that a stored file can be retrieved by contacting any $k$ out of
$n$ nodes, where $k < n$. Exact repair creates a replica of the
data previously stored on the node [4], [9]. Regenerating codes
are specified by the parameters $\{[n, k, d], [\alpha, \beta, B]\}$,
where $n$ is the number of nodes, $k$ is the number of nodes
that need to be contacted to recover a file $B$, and $d$ is the
repair degree (the number of nodes that must be contacted to
regenerate data in case of a node failure). The capacity of a
node is given by $\alpha$, and the repair bandwidth for each of
the $d$ nodes is $\beta$, so the total repair bandwidth is
$d\beta$.

The tradeoff between storage capacity and repair bandwidth in
regenerating codes has given rise to two new classes of codes,
namely Minimum Storage Regenerating (MSR) codes and Minimum
Bandwidth Regenerating (MBR) codes. MBR codes employ exact and
uncoded data repair. Uncoded repair means that a particular set
of $d$ nodes, as listed in the Repair Table of the node, are
contacted and one data packet is downloaded from each, thus
reducing the repair complexity. MBR codes are formed by the
concatenation of an outer MDS code and an inner Fractional
Repetition (FR) code. The MDS code maintains the MDS property
of the DSS, while the FR code allows for an uncoded repair
process. These concatenated codes are known as DRESS
(Distributed Replication-based Exact Simple Storage) codes.
Many constructions of FR codes (and hence DRESS codes) are
known, based on bipartite graphs [10], resolvable designs [11],
regular graphs [9], [12], and other structures.

Fig. 1. A DRESS code consisting of an inner fractional
repetition code $C$ having $n = 5$ nodes, $\theta = 10$ packets,
replication factor $\rho = 2$, and repair degree $d = 4$, and an
outer MDS code.

In this paper, an algorithm for the construction of Fractional
Repetition (FR) codes is presented which is based on the
incidence matrix of the node-packet distribution. Algorithms
are also given for the construction of regular graphs. The rest
of the paper is organized as follows. In Section II, the basics
of FR codes and the incidence matrix of the node-packet
distribution are given. Section III presents an algorithm for
the construction of the $n \times \theta$ incidence matrix of
the node-packet distribution of an FR code. Algorithms for
constructing regular graphs and hence FR codes for $n = \theta$
are presented in Section IV. Finally, Section V concludes the
paper with some general remarks.

II. Background 

Distributed Replication-based Simple Storage (DRESS)
codes consist of an inner Fractional Repetition (FR) code and
an outer MDS code, as shown in Figure 1. FR codes are
formally defined in Definition 1.

Definition 1. (Fractional Repetition Code): A Fractional
Repetition (FR) code $C$, with repetition degree $\rho$, for an
$(n, k, d)$ DSS, is a collection $C$ of $n$ subsets
$U_1, U_2, \ldots, U_n$ of a set $\Omega = \{1, \ldots, \theta\}$,
each having size $d$, i.e., $|U_i| = d$, satisfying the
condition that each element of $\Omega$ belongs to exactly
$\rho$ sets in the collection. The code is denoted by
$C : (n, \theta, d, \rho)$, and the parameters of $C$ are
related by $nd = \rho\theta$.

TABLE I. Node-Packet Distribution Incidence Matrix $M$ of
Size $5 \times 10$ for the FR Code $C : (5, 10, 4, 2)$ Shown in
Figure 1

TABLE II. The Number of Possible FR Codes for $n$ ...
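
The conditions in Definition 1 can be checked mechanically. The
sketch below is illustrative only: the function name `is_fr_code`
and the three-node toy code are assumptions made for this example
and do not appear in the paper.

```python
from collections import Counter

def is_fr_code(subsets, theta, d, rho):
    """Check the conditions of Definition 1 for packets 1..theta."""
    n = len(subsets)
    universe = set(range(1, theta + 1))
    # every node stores exactly d packets drawn from the packet set
    if any(len(u) != d or not u <= universe for u in subsets):
        return False
    # every packet must be stored on exactly rho nodes
    counts = Counter(p for u in subsets for p in u)
    if any(counts[p] != rho for p in universe):
        return False
    # parameter relation n*d = rho*theta
    return n * d == rho * theta

# hypothetical toy code: 3 nodes, 3 packets, d = 2, rho = 2
toy = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1, 2}, {1, 2}, {1, 3}]   # packet 1 is stored three times
print(is_fr_code(toy, theta=3, d=2, rho=2))
print(is_fr_code(bad, theta=3, d=2, rho=2))
```

The check counts replicas per packet directly, so a collection that
has the right total number of stored packets but an uneven
replication is still rejected.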

Example 2. Figure 1 gives an example of an FR code. Suppose
there are 9 packets from $\mathbb{F}_q$ (a finite field with $q$
elements) and 5 storage nodes. Using an MDS code, the 9 packets
are first encoded into 10 packets such that the last packet is
the parity packet. Next, each of the 10 packets is replicated
twice ($\rho = 2$) on the 5 nodes according to the arrangement
of the FR code $C : (5, 10, 4, 2)$ in the figure. This code can
tolerate 1 failure, and the data of the failed node can be
recovered by contacting 4 nodes; hence the repair degree is 4.
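
To make the uncoded repair in Example 2 concrete, the following
sketch builds one possible $(5, 10, 4, 2)$ layout and derives the
repair table of a failed node. The layout is an assumption: packets
are taken to be the edges of the complete graph on 5 vertices, which
may be labelled differently from Figure 1, and the name
`repair_table` is hypothetical.

```python
from itertools import combinations

# Assumed layout: the 10 packets are the edges of the complete graph K5,
# and each node stores the 4 edges incident to it.
edges = list(combinations(range(5), 2))
nodes = {v: {p for p, e in enumerate(edges, 1) if v in e} for v in range(5)}

def repair_table(failed):
    """Map each lost packet to the surviving node holding its replica."""
    table = {}
    for packet in sorted(nodes[failed]):
        helper = next(v for v in nodes if v != failed and packet in nodes[v])
        table[packet] = helper
    return table

table = repair_table(0)
print(table)
```

In this layout each lost packet comes from a different surviving
node, so the failed node is rebuilt by copying one packet from each
of $d = 4$ helpers without any decoding.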

Remark 3. An FR code $C : (n, \theta, d, \rho)$ can also be
characterized by a node-packet distribution incidence matrix $M$
of size $n \times \theta$ with row weight $d$ and column weight
$\rho$. For example, the incidence matrix for the FR code
$C : (5, 10, 4, 2)$ shown in Figure 1 is given by Table I. The
row weight is 4 and the column weight is 2.
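
The passage from node contents to an incidence matrix can be
sketched as follows. The node contents used here are a hypothetical
labelling of a $(5, 10, 4, 2)$ code and are not claimed to match
Table I; `incidence_matrix` is an illustrative name.

```python
def incidence_matrix(subsets, theta):
    """Row i has a 1 in column j-1 when node i stores packet j."""
    return [[1 if j in u else 0 for j in range(1, theta + 1)]
            for u in subsets]

# Assumed node contents for a (5, 10, 4, 2) code (hypothetical labelling).
subsets = [{1, 2, 3, 4}, {1, 5, 6, 7}, {2, 5, 8, 9},
           {3, 6, 8, 10}, {4, 7, 9, 10}]
M = incidence_matrix(subsets, 10)
row_weights = [sum(r) for r in M]          # one entry per node
col_weights = [sum(c) for c in zip(*M)]    # one entry per packet
for r in M:
    print(*r)
```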

A. Equivalence of Fractional Repetition codes 

Two Fractional Repetition codes $C_1 : (n_1, \theta_1, d_1, \rho_1)$
and $C_2 : (n_2, \theta_2, d_2, \rho_2)$ are said to be equivalent if

1) The number of nodes and the number of packets in
the system are the same, i.e., $n_1 = n_2$ and
$\theta_1 = \theta_2$. Hence the dimensions of the
corresponding incidence matrices are the same, i.e.,
$n_1 \times \theta_1 = n_2 \times \theta_2$.

2) The repair degree and the replication factor are the
same, i.e., $d_1 = d_2$ and $\rho_1 = \rho_2$. Hence the
corresponding incidence matrices have the same row
weight $d$ and column weight $\rho$.

3) The same packet distribution can be achieved by
simply relabeling the nodes and packets of one of the
codes, i.e., the incidence matrix of $C_1$ can be
obtained by applying permutations to the rows and
columns of the incidence matrix of $C_2$.
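
For small codes, condition 3 can be tested by brute force. The
sketch below is not taken from the paper: it tries every row
permutation and compares the columns as a sorted multiset, which
makes the column order irrelevant. The function name and the sample
matrices are assumptions for illustration, and the search is
factorial in $n$.

```python
from itertools import permutations

def equivalent(M1, M2):
    """True if M1 equals M2 up to row and column permutations."""
    if len(M1) != len(M2) or len(M1[0]) != len(M2[0]):
        return False
    target = sorted(zip(*M2))              # columns of M2 as a multiset
    for rows in permutations(M1):          # try every node relabeling
        if sorted(zip(*rows)) == target:
            return True
    return False

# hypothetical (4, 4, 2, 2) examples
cycle = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 1]]
relabeled = [row[::-1] for row in (cycle[2], cycle[0], cycle[3], cycle[1])]
doubled = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
print(equivalent(cycle, relabeled), equivalent(cycle, doubled))
```

The last pair shows that equal parameters alone do not imply
equivalence: both matrices have row weight 2 and column weight 2,
yet no relabeling maps one to the other.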

Remark 4. An incidence matrix of dimension $n \times \theta$
defines an FR code with $n$ nodes and $\theta$ packets. The
repair degree is $d$, and the replication factor is $\rho$. Now,
taking the transpose of this matrix gives a matrix of dimension
$\theta \times n$. The weight of each row is now $\rho$, and the
weight of each column is $d$. This new matrix also satisfies the
conditions for an FR code, and corresponds to a code with
$\theta$ nodes, $n$ packets, repair degree $\rho$, and
replication factor $d$.
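
The exchange of parameters under transposition can be observed on a
small case. The matrix below is a hypothetical $(4, 6, 3, 2)$ code
chosen for illustration; the helper names are assumptions.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def weights(M):
    """Return the sets of distinct row weights and column weights."""
    return {sum(r) for r in M}, {sum(c) for c in zip(*M)}

# hypothetical (n, theta, d, rho) = (4, 6, 3, 2) incidence matrix
M = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
Mt = transpose(M)
print(weights(M))    # row weight d, column weight rho
print(weights(Mt))   # the two weights swap roles
```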

III. Enumeration of Fractional Repetition Codes 
using Incidence Matrices 

To enumerate the FR codes for a given $n$, the replication
factor $\rho$ can be varied in the range $2 \le \rho \le n - 1$,
and the repair degree $d$ in the range $2 \le d \le n - 1$. In
each case, $\theta$ can be determined using $nd = \rho\theta$,
and the corresponding incidence matrix $M$ of size
$n \times \theta$ can be filled such that the weight of each row
is $d$ and the weight of each column is $\rho$ to obtain an FR
code. Algorithm 1 is given below to fill the incidence matrix
with 1's and 0's. Table II summarizes the number of possible FR
codes up to length $n = 10$. For larger $n$, the data can be
obtained from http://www.ece.uvic.ca/~agullive/manish/List.html
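
The parameter sweep described above can be sketched as follows. This
only lists admissible triples $(d, \rho, \theta)$ for which $\theta$
is an integer; it is an illustration under that reading of the text
and is not claimed to reproduce the counts in Table II. The function
name `fr_parameters` is hypothetical.

```python
def fr_parameters(n):
    """List (d, rho, theta) with 2 <= d, rho <= n - 1 and n*d = rho*theta."""
    params = []
    for rho in range(2, n):
        for d in range(2, n):
            if (n * d) % rho == 0:        # theta must be an integer
                params.append((d, rho, n * d // rho))
    return params

print(fr_parameters(5))
```

For $n = 5$ the list contains the parameters $(d, \rho, \theta) =
(4, 2, 10)$ of the code in Example 2.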



Algorithm 1 Generate a node-packet distribution incidence
matrix $M$ of size $n \times \theta$

Require: $n$, $d$, $\theta$, $\rho$ and an all-zero matrix $M$ of
size $n \times \theta$
Ensure: $M_{n \times \theta}$ such that weight(row[$M$]) = $d$
and weight(column[$M$]) = $\rho$

1 : Place a 1 in $d$ positions of the 1st row from left to right
starting from $m_{11}$ and move to the 2nd row.

2 : In the current row, place a 1 in the first column $j$,
$2 \le j \le \theta$, for which the column weight is $< \rho$.

3 : Compute the weight of all consecutive columns from
$j + 1$ to $\theta$. If the minimum weight of these columns is
the same, go to Step 4; otherwise place 1's in increasing order
of weight until weight(row) = $d$ or the last column is
reached. Go to Step 6.

4 : Traversing rows from the top, identify the first row
having an entry 1 which corresponds to a 1 in the $j$th column
(determined in Step 2) in the current row.

5 : Traversing consecutive columns from $j + 1$ to $\theta$ in
the current row, place a 1 in the column for which the first 0
occurs in the row identified in Step 4.

6 : If weight(row) $< d$, go to Step 2; otherwise move to
Step 7.

7 : If a next row exists, move to that row and go to Step 2;
otherwise Stop.
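
As a point of comparison, the row and column weight constraints
alone can be met with a much simpler greedy fill. The sketch below
is not Algorithm 1: it always places the 1's of a row in the $d$
columns that currently have the lowest weight, and it makes no
attempt to limit how many packets two nodes share (the purpose of
Steps 3 to 5). The name `balanced_fill` is an assumption.

```python
def balanced_fill(n, d, theta, rho):
    """Fill an n x theta 0/1 matrix with row weight d, column weight rho."""
    if n * d != rho * theta or d > theta:
        raise ValueError("need n*d = rho*theta and d <= theta")
    M = [[0] * theta for _ in range(n)]
    col_weight = [0] * theta
    for row in M:
        # choose the d lightest columns; ties go to the smaller index
        chosen = sorted(range(theta), key=lambda j: (col_weight[j], j))[:d]
        for j in chosen:
            row[j] = 1
            col_weight[j] += 1
    return M

M = balanced_fill(6, 4, 8, 3)
for r in M:
    print(*r)
```

Because the lightest columns are always chosen, column weights never
differ by more than one during the fill, and $nd = \rho\theta$ then
forces every column to end at weight $\rho$.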



Example 5. For $n = 6$, $d = 4$, $\theta = 8$ and $\rho = 3$,
Algorithm 1 gives a $6 \times 8$ incidence matrix $M_{6 \times 8}$
(entries omitted here) with row weight 4 and column weight 3.
This matrix gives the FR code $C : (6, 8, 4, 3)$ as shown in
Figure 2.

IV. Construction of Regular Graphs 

FR codes can be generated using regular graphs of degree $d$
[9], [12]. Therefore, Algorithm 2 is presented for generating
regular graphs. We also present Algorithms 3 and 4 for
constructing regular graphs based on the approach of filling
the incidence matrix to obtain an FR code. To the best of our
knowledge, this solution has not been reported in the vast
literature on regular graphs. An example is given for each
algorithm. The proposed algorithms are constrained to
$nd \in 2\mathbb{Z}^+$ and $\rho = 2$. Note that a regular graph
of degree $d$ is a graph where every vertex has the same degree
$d$, which is possible only for $nd \in 2\mathbb{Z}^+$ (the
degrees sum to $nd$, which must equal twice the number of
edges).

Fig. 2. The FR code $C : (6, 8, 4, 3)$ generated using the
incidence matrix in Example 5.

Algorithm 2 Regular Graph for $nd \in 2\mathbb{Z}^+$, $\rho = 2$
and $d < n - 1$

1 : Divide the $n$ vertices into two sets of vertices,
$U = \{u_1, u_2, \ldots\}$ and $V = \{v_1, v_2, \ldots\}$

2 : Construct two cyclic graphs $G_1 : (U, E_1)$ and
$G_2 : (V, E_2)$, with $G_1$ enclosing $G_2$

if $n$ is odd then
    Select two vertices $v_i$ and $v_j$ such that edge
    $\{v_i, v_j\} \notin E_2$
    Add edge $\{v_i, v_j\}$
end if

Select vertices $u_i \in U$ and $v_j \in V$ such that
deg($v_j$) $\ne d$
Add edge $\{u_i, v_j\}$
Repeat for vertex $u_i$ until deg($u_i$) = [...]

Select vertices $u_i, u_j \in U$ such that edge
$\{u_i, u_j\} \notin E_1$
Add edge $\{u_i, u_j\}$
Repeat for vertex $u_j$ until deg($u_j$) = [...]

Pick vertices $v_i, v_j \in V$ such that edge
$\{v_i, v_j\} \notin E_2$
Add edge $\{v_i, v_j\}$
Repeat for vertex $v_i$ until deg($v_i$) = [...]

Example 6. Algorithm 2 can be used to generate a $d$-regular
graph for $\rho = 2$ and (a) $n = 4$, $d = 2$ (b) $n = 8$,
$d = 4$ (c) $n = 16$, $d = 8$, as shown in Figure 3, and an FR
code as shown in Figure 4.
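
The link between a $d$-regular graph and an FR code with
$\rho = 2$ can be sketched with a standard circulant construction.
This is a substitute technique chosen for brevity, not Algorithm 2,
and the graphs it produces need not coincide with those in Figure 3.
The function name is hypothetical.

```python
def circulant_regular_graph(n, d):
    """Edges of a d-regular graph on vertices 0..n-1 (circulant graph)."""
    if not (0 < d < n) or (n * d) % 2:
        raise ValueError("need 0 < d < n and n*d even")
    edges = set()
    for v in range(n):
        for s in range(1, d // 2 + 1):        # neighbours at distance s
            edges.add(frozenset((v, (v + s) % n)))
        if d % 2:                             # odd d: add the opposite vertex
            edges.add(frozenset((v, (v + n // 2) % n)))
    return sorted(tuple(sorted(e)) for e in edges)

# Vertices are nodes, edges are packets (rho = 2, theta = n*d/2).
n, d = 8, 4
edges = circulant_regular_graph(n, d)
nodes = [{p for p, e in enumerate(edges, 1) if v in e} for v in range(n)]
print(len(edges), [len(u) for u in nodes])
```

Since the graph is simple, two nodes share a packet only when they
are joined by an edge, so any two nodes have at most one packet in
common.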

Fig. 3. A regular graph for $\rho = 2$ and (a) $n = 4$, $d = 2$
(b) $n = 8$, $d = 4$ (c) $n = 16$, $d = 8$. The vertices of the
graphs are nodes, and the edges originating from them are the
packets stored in those nodes. Thus Algorithm 2 generates a
$d$-regular graph which depicts the packet distribution among
the nodes as shown in Figure 4.

Fig. 4. Packet distribution among the nodes for (a) $n = 4$,
$\theta = 4$, $d = 2$, $\rho = 2$ (b) $n = 8$, $\theta = 16$,
$d = 4$, $\rho = 2$. This distribution shows that any two nodes
have either no packet or 1 packet in common.

An adjacency matrix is a matrix depicting the relationship
between vertices, showing whether they are connected or not.
FR codes can be represented by graphs, where the vertices
represent the nodes and the edges represent the packets. These
can be interchanged, thus making edges the nodes and vertices
the packets. Now for $nd$ even and $\rho = 2$, graphs can be
represented by an adjacency matrix of dimensions $n \times n$.
This matrix acts as a basis for generating the incidence matrix
of the graph. The incidence matrix shows the packet
distribution over the $n$ nodes. We present two algorithms to
generate the adjacency matrix for parameters $\rho$, $n$, $d$,
$\theta$, with constraints $\rho = 2$ and
$nd \in 2\mathbb{Z}^+$, where $n = \theta$, row weight $d$ and
column weight $\rho$. This provides an FR code with the same
number of nodes and packets.
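
How an adjacency matrix yields an incidence matrix can be sketched
as follows, assuming a symmetric 0/1 adjacency matrix with a zero
diagonal (a simple graph). That assumption, the function name, and
the 4-cycle used as input are illustrative choices, not the output
of Algorithm 3 or 4.

```python
def adjacency_to_incidence(A):
    """Vertex-edge incidence matrix of the graph with adjacency matrix A."""
    n = len(A)
    # one column (packet) per edge {i, j} with i < j
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i][j]]
    return [[1 if v in e else 0 for e in edges] for v in range(n)]

# adjacency matrix of the 4-cycle 0-1-2-3-0 (n = 4, d = 2)
A = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
M = adjacency_to_incidence(A)
for r in M:
    print(*r)
```

For this input the result is a $4 \times 4$ matrix with row weight
2 and column weight 2, that is, an FR code with $n = \theta = 4$.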

Algorithm 3 Adjacency Matrix $A$ of Size $n \times n$

Require: $n$, $d$, $\theta$, $\rho$ and a null matrix $A$ of size
$n \times n$
Ensure: $A_{n \times n}$ such that weight(row[$A$]) = $d$
and weight(column[$A$]) = $\rho$

1 : Set $a_{11} = 1$ and fill the consecutive entries of the
first row with $(d - 1)$ 1's from left to right

2 : Set the first column as the transpose of the first row

3 : Move right to left, filling 1's such that the weight of the
$i$th row is $d$

4 : Take the transpose of the $i$th row and fill the $i$th column

5 : Increase $i$ by one

6 : Go to Step 3, if $i < n$.



Example 7. The adjacency matrix for $n = 6$, $d = 4$,
$\theta = 8$, $\rho = 3$ generated by Algorithm 3 is

$A_{6 \times 6} = \ldots$

Algorithm 4 Adjacency Matrix $A$ of Size $n \times n$

Require: $n$, $d$, $\theta$, $\rho$ and a null matrix $A$ of size
$n \times n$
Ensure: $A_{n \times n}$ such that weight(row[$A$]) = $d$
and weight(column[$A$]) = $\rho$

1 : For $1 \le i \le n$ and $j = n$ to 1

2 : Update $A[i][j]$ and $A[j][i]$ to 1 ($i \ne j$) such that
weight ...

Example 8. The adjacency matrix for $n = 6$, $d = 4$,
$\theta = 6$, $\rho = 4$ generated by Algorithm 4 is

$A_{6 \times 6} = \ldots$
V. Conclusion

In this paper, several algorithms have been presented for
constructing FR codes. Algorithm 1 is a general construction
technique which, for any value of $n$, calculates the possible
values of $d$, $\rho$ and $\theta$, and then generates the
corresponding node-packet matrices. The complexity of the
algorithm is $O(n^3)$. The algorithm has been tested for values
up to $n = 100$, and the results have been recorded. This data
is available from
http://www.ece.uvic.ca/~agullive/manish/List.html. Our aim was
to generate a common data storage pattern for any given set of
parameters. The algorithm generates a node-packet matrix for
each possible value of $d$, $\rho$ and $\theta$ for a range of
$n$. New algorithms were also presented for constructing
regular graphs.



